- PERLDATA(1) UNIX System V (Release 0.0 Patchlevel 00) PERLDATA(1)
-
-
-
- NAME
- perldata - Perl data structures
-
- DESCRIPTION
- Variable names
-
- Perl has three data structures: scalars, arrays of scalars,
- and associative arrays of scalars, known as "hashes".
- Normal arrays are indexed by number, starting with 0.
- (Negative subscripts count from the end.) Hash arrays are
- indexed by string.
-
- Scalar values are always named with '$', even when referring
- to a scalar that is part of an array. It works like the
- English word "the". Thus we have:
-
- $days # the simple scalar value "days"
- $days[28] # the 29th element of array @days
- $days{'Feb'} # the 'Feb' value from hash %days
- $#days # the last index of array @days
-
- but entire arrays or array slices are denoted by '@', which
- works much like the word "these" or "those":
-
- @days # ($days[0], $days[1],... $days[n])
- @days[3,4,5] # same as @days[3..5]
- @days{'a','c'} # same as ($days{'a'},$days{'c'})
-
- and entire hashes are denoted by '%':
-
- %days # (key1, val1, key2, val2 ...)
-
- In addition, subroutines are named with an initial '&',
- though this is optional when it's otherwise unambiguous
- (just as "do" is often redundant in English). Symbol table
- entries can be named with an initial '*', but you don't
- really care about that yet.
-
- Every variable type has its own namespace. You can, without
- fear of conflict, use the same name for a scalar variable,
- an array, or a hash (or, for that matter, a filehandle, a
- subroutine name, or a label). This means that $foo and @foo
- are two different variables. It also means that $foo[1] is
- a part of @foo, not a part of $foo. This may seem a bit
- weird, but that's okay, because it is weird.
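
The separate namespaces can be seen in a minimal sketch like the one below. The name foo and all the values are invented for the illustration.

```perl
use strict;
use warnings;

# Three unrelated variables that happen to share the name "foo".
my $foo = 'scalar value';
my @foo = ('zero', 'one', 'two');
my %foo = ('key' => 'hash value');

# The sigil says what kind of value comes back; the brackets say
# which variable is being indexed.
my $element = $foo[1];      # element of @foo, unrelated to $foo
my $entry   = $foo{'key'};  # value from %foo
my $last    = $#foo;        # last index of @foo

print "$foo / $element / $entry / $last\n";
```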
-
- Since variable and array references always start with '$',
- '@', or '%', the "reserved" words aren't in fact reserved
- with respect to variable names. (They ARE reserved with
- respect to labels and filehandles, however, which don't have
- an initial special character. You can't have a filehandle
- named "log", for instance. Hint: you could say
-
- Page 1 (printed 6/30/95)
-
- open(LOG,'logfile') rather than open(log,'logfile'). Using
- uppercase filehandles also improves readability and protects
- you from conflict with future reserved words.) Case IS
- significant--"FOO", "Foo" and "foo" are all different names.
- Names that start with a letter or underscore may also
- contain digits and underscores.
-
- It is possible to replace such an alphanumeric name with an
- expression that returns a reference to an object of that
- type. For a description of this, see the perlref manpage.
-
- Names that start with a digit may only contain more digits.
- Names which do not start with a letter, underscore, or
- digit are limited to one character, e.g. "$%" or "$$".
- (Most of these one character names have a predefined
- significance to Perl. For instance, $$ is the current
- process id.)
-
- Context
-
- The interpretation of operations and values in Perl
- sometimes depends on the requirements of the context around
- the operation or value. There are two major contexts:
- scalar and list. Certain operations return list values in
- contexts wanting a list, and scalar values otherwise. (If
- this is true of an operation it will be mentioned in the
- documentation for that operation.) In other words, Perl
- overloads certain operations based on whether the expected
- return value is singular or plural. (Some words in English
- work this way, like "fish" and "sheep".)
-
- In a reciprocal fashion, an operation provides either a
- scalar or a list context to each of its arguments. For
- example, if you say
-
- int( <STDIN> )
-
- the integer operation provides a scalar context for the
- <STDIN> operator, which responds by reading one line from
- STDIN and passing it back to the integer operation, which
- will then find the integer value of that line and return
- that. If, on the other hand, you say
-
- sort( <STDIN> )
-
- then the sort operation provides a list context for <STDIN>,
- which will proceed to read every line available up to the
- end of file, and pass that list of lines back to the sort
- routine, which will then sort those lines and return them as
- a list to whatever the context of the sort was.
-
- Assignment is a little bit special in that it uses its left
- argument to determine the context for the right argument.
- Assignment to a scalar evaluates the righthand side in a
- scalar context, while assignment to an array or array slice
- evaluates the righthand side in a list context. Assignment
- to a list also evaluates the righthand side in a list
- context.
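
As a rough illustration of how the left side of an assignment picks the context for the right side, consider the sketch below. The array contents and variable names are made up for the example.

```perl
use strict;
use warnings;

my @lines = ("alpha\n", "beta\n", "gamma\n");

my $scalar  = @lines;     # scalar on the left: @lines yields its length
my ($first) = @lines;     # list on the left: $first gets the first element
my @copy    = @lines;     # array on the left: list context, full copy

my @slice;
@slice[0, 1] = @lines;    # array slice on the left: list context as well

print "scalar=$scalar copy=", scalar(@copy), " slice=", scalar(@slice), "\n";
```

Note the parentheses around $first: they are what turns the left side into a list.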
-
- User defined subroutines may choose to care whether they are
- being called in a scalar or list context, but most
- subroutines do not need to care, because scalars are
- automatically interpolated into lists. See the wantarray
- entry in the perlfunc manpage.
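
A subroutine that does care about its calling context might look like the following sketch. The subroutine name words and its behavior are hypothetical, chosen only to show wantarray at work.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the words themselves in list context,
# but only the number of words in scalar context.
sub words {
    my ($text) = @_;
    my @words = split ' ', $text;
    return wantarray ? @words : scalar @words;
}

my @list  = words('the quick brown fox');   # called in list context
my $count = words('the quick brown fox');   # called in scalar context

print "list: @list\n";
print "count: $count\n";
```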
-
- Scalar values
-
- Scalar variables may contain various kinds of singular data,
- such as numbers, strings and references. In general,
- conversion from one form to another is transparent. (A
- scalar may not contain multiple values, but may contain a
- reference to an array or hash containing multiple values.)
- Because of the automatic conversion of scalars, operations
- and functions that return scalars don't need to care (and,
- in fact, can't care) whether the context is looking for a
- string or a number.
-
- A scalar value is interpreted as TRUE in the Boolean sense
- if it is not the null string or the number 0 (or its string
- equivalent, "0"). The Boolean context is just a special
- kind of scalar context.
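
The truth rule can be checked with a short sketch such as the one below. The labels are invented for the example; the strings "0.0", "00" and a single space are included because they are easy to mistake for false values, yet none of them is the null string or the string "0".

```perl
use strict;
use warnings;

# Each candidate is paired with a label so the empty string stays visible.
my @candidates = (
    [ 'empty string', ''    ],
    [ 'number 0',     0     ],
    [ 'string "0"',   '0'   ],
    [ 'string "0.0"', '0.0' ],
    [ 'string "00"',  '00'  ],
    [ 'one space',    ' '   ],
    [ 'number -1',    -1    ],
);

my %verdict;
for my $pair (@candidates) {
    my ($label, $value) = @$pair;
    $verdict{$label} = $value ? 'TRUE' : 'FALSE';
    print "$label: $verdict{$label}\n";
}
```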
-
- There are actually two varieties of null scalars: defined
- and undefined. Undefined null scalars are returned when
- there is no real value for something, such as when there was
- an error, or at end of file, or when you refer to an
- uninitialized variable or element of an array. An undefined
- null scalar may become defined the first time you use it as
- if it were defined, but prior to that you can use the
- defined() operator to determine whether the value is defined
- or not.
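
A small sketch of the difference between the two kinds of null scalar, using defined(). The variable names are made up for the example.

```perl
use strict;
use warnings;

my $unset;                 # declared but never assigned: undefined
my $empty = '';            # defined, but still false
my @list  = (1, 2, 3);

my @report = (
    defined($unset)    ? 'defined' : 'undefined',
    defined($empty)    ? 'defined' : 'undefined',
    defined($list[10]) ? 'defined' : 'undefined',   # past the end of @list
);

print "@report\n";
```

Merely reading $list[10] does not extend the array.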
-
- The length of an array is a scalar value. You may find the
- length of array @days by evaluating $#days, as in csh.
- (Actually, it's not the length of the array, it's the
- subscript of the last element, since there is (ordinarily) a
- 0th element.) Assigning to $#days changes the length of the
- array. Shortening an array by this method destroys
- intervening values. Lengthening an array that was
- previously shortened NO LONGER recovers the values that were
- in those elements. (It used to in Perl 4, but we had to
- break this to make sure destructors were called when
- expected.) You can also gain some measure of efficiency by
- preextending an array that is going to get big. (You can
- also extend an array by assigning to an element that is off
- the end of the array.) You can truncate an array down to
- nothing by assigning the null list () to it. The following
- are equivalent:
-
- @whatever = ();
- $#whatever = $[ - 1;
-
- If you evaluate a named array in a scalar context, it
- returns the length of the array. (Note that this is not
- true of lists, which return the last value, like the C comma
- operator.) The following is always true:
-
- scalar(@whatever) == $#whatever - $[ + 1;
-
- Version 5 of Perl changed the semantics of $[: files that
- don't set the value of $[ no longer need to worry about
- whether another file changed its value. (In other words,
- use of $[ is deprecated.) So in general you can just assume
- that
-
- scalar(@whatever) == $#whatever + 1;
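
The relationship between $#days and the array length, and the effect of assigning to $#days, can be sketched as follows. This assumes $[ is left at its default of 0; the day names are example data.

```perl
use strict;
use warnings;

my @days = qw(Mon Tue Wed Thu Fri Sat Sun);

my $last_index = $#days;          # subscript of the last element
my $length     = scalar(@days);   # number of elements

$#days = 4;                       # shorten to five elements
my $after_shrink = "@days";

$#days = 6;                       # lengthen again: the old values are gone
my $recovered = defined($days[6]) ? 'yes' : 'no';

@days = ();                       # truncate down to nothing
my $empty_last = $#days;

print "$last_index $length [$after_shrink] $recovered $empty_last\n";
```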
-
- If you evaluate a hash in a scalar context, it returns a
- value which is true if and only if the hash contains any
- key/value pairs. (If there are any key/value pairs, the
- value returned is a string consisting of the number of used
- buckets and the number of allocated buckets, separated by a
- slash. This is pretty much only useful to find out whether
- Perl's (compiled in) hashing algorithm is performing poorly
- on your data set. For example, you stick 10,000 things in a
- hash, but evaluating %HASH in scalar context reveals "1/16",
- which means only one out of sixteen buckets has been
- touched, and presumably contains all 10,000 of your items.
- This isn't supposed to happen.)
-
- Scalar value constructors
-
- Numeric literals are specified in any of the customary
- floating point or integer formats:
-
- 12345
- 12345.67
- .23E-10
- 0xffff # hex
- 0377 # octal
- 4_294_967_296 # underline for legibility
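
To see what the literals above evaluate to, a quick sketch like this one can be run. The array name is arbitrary.

```perl
use strict;
use warnings;

# The same literals as in the list above, stored in order.
my @values = (12345, 12345.67, .23E-10, 0xffff, 0377, 4_294_967_296);

# Print each value the way Perl stringifies it.
print "$_\n" for @values;
```

The hex and octal forms are only notations for ordinary numbers: 0xffff is 65535 and 0377 is 255.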
-
- String literals are delimited by either single or double
- quotes. They work much like shell quotes: double-quoted
- string literals are subject to backslash and variable
- substitution; single-quoted strings are not (except for "\'"
- and "\\"). The usual Unix backslash rules apply for making
- characters such as newline, tab, etc., as well as some more
- exotic forms. See the qq entry in the perlop manpage for a
- list.
-
- You can also embed newlines directly in your strings, i.e.
- they can end on a different line than they begin. This is
- nice, but if you forget your trailing quote, the error will
- not be reported until Perl finds another line containing the
- quote character, which may be much further on in the script.
- Variable substitution inside strings is limited to scalar
- variables, arrays, and array slices. (In other words,
- identifiers beginning with $ or @, followed by an optional
- bracketed expression as a subscript.) The following code
- segment prints out "The price is $100."
-
- $Price = '$100'; # not interpreted
- print "The price is $Price.\n"; # interpreted
-
- As in some shells, you can put curly brackets around the
- identifier to delimit it from following alphanumerics. Also
- note that a single-quoted string must be separated from a
- preceding word by a space, since single quote is a valid
- (though discouraged) character in an identifier (see the
- Packages entry in the perlmod manpage).
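
The quoting and interpolation rules above can be sketched as follows. Apart from $Price, the variable names and strings are invented for the example.

```perl
use strict;
use warnings;

my $Price = '$100';
my @items = ('apple', 'pear');
my $unit  = 'kilo';

my $single = 'The price is $Price.';        # single quotes: no interpolation
my $double = "The price is $Price.";        # double quotes: interpolated
my $braced = "Sold by the ${unit}gram.";    # braces end the name at "unit"
my $array  = "First: $items[0]; all: @items";

print "$single\n$double\n$braced\n$array\n";
```

Without the curly brackets, "$unitgram" would refer to a different (and here undeclared) variable.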
-
- Two special literals are __LINE__ and __FILE__, which
- represent the current line number and filename at that point
- in your program. They may only be used as separate tokens;
- they will not be interpolated into strings. In addition,
- the token __END__ may be used to indicate the logical end of
- the script before the actual end of file. Any following
- text is ignored, but may be read via the DATA filehandle.
- (The DATA filehandle may read data only from the main
- script, but not from any required file or evaluated string.)
- The two control characters ^D and ^Z are synonyms for
- __END__.
-
- A word that doesn't have any other interpretation in the
- grammar will be treated as if it were a quoted string.
- These are known as "barewords". As with filehandles and
- labels, a bareword that consists entirely of lowercase
- letters risks conflict with future reserved words, and if
- you use the -w switch, Perl will warn you about any such
- words. Some people may wish to outlaw barewords entirely.
- If you say
-
- use strict 'subs';
-
- then any bareword that would NOT be interpreted as a
- subroutine call produces a compile-time error instead. The
- restriction lasts to the end of the enclosing block. An
- inner block may countermand this by saying no strict 'subs'.
-
- Array variables are interpolated into double-quoted strings
- by joining all the elements of the array with the delimiter
- specified in the $" variable, space by default. The
- following are equivalent:
-
- $temp = join($",@ARGV);
- system "echo $temp";
-
- system "echo @ARGV";
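
The same equivalence can be checked without running a shell command. In this sketch the array @args stands in for @ARGV, and the comma separator is just an example value.

```perl
use strict;
use warnings;

my @args = ('one', 'two', 'three');

my $default = "@args";             # joined with a single space
my $joined  = join($", @args);     # the explicit equivalent

my $custom;
{
    local $" = ', ';               # change the separator in this block only
    $custom = "@args";
}

print "$default\n$custom\n";
```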
-
- Within search patterns (which also undergo double-quotish
- substitution) there is a bad ambiguity: Is /$foo[bar]/ to
- be interpreted as /${foo}[bar]/ (where [bar] is a character
- class for the regular expression) or as /${foo[bar]}/ (where
- [bar] is the subscript to array @foo)? If @foo doesn't
- otherwise exist, then it's obviously a character class. If
- @foo exists, Perl takes a good guess about [bar], and is
- almost always right. If it does guess wrong, or if you're
- just plain paranoid, you can force the correct
- interpretation with curly brackets as above.
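
A minimal sketch of forcing each interpretation with curly brackets. The values of $foo and @foo are assumptions chosen so that the two readings give visibly different patterns.

```perl
use strict;
use warnings;

my $foo = 'x';
my @foo = ('zero', 'one');

# ${foo} followed by the character class [bar]:
# matches "x" and then one of b, a, or r.
my $class_match = ('xb' =~ /^${foo}[bar]$/) ? 1 : 0;

# ${foo[1]} is element 1 of @foo, so the pattern is /^one$/.
my $elem_match  = ('one' =~ /^${foo[1]}$/) ? 1 : 0;

print "class=$class_match element=$elem_match\n";
```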
-
- A line-oriented form of quoting is based on the shell
- "here-doc" syntax. Following a << you specify a string to
- terminate the quoted material, and all lines following the
- current line down to the terminating string are the value of
- the item. The terminating string may be either an
- identifier (a word), or some quoted text. If quoted, the
- type of quotes you use determines the treatment of the text,
- just as in regular quoting. An unquoted identifier works
- like double quotes. There must be no space between the <<
- and the identifier. (If you put a space it will be treated
- as a null identifier, which is valid, and matches the first
- blank line--see the Merry Christmas example below.) The
- terminating string must appear by itself (unquoted and with
- no surrounding whitespace) on the terminating line.
-
- print <<EOF; # same as above
- The price is $Price.
- EOF
-
- print <<"EOF"; # same as above
- The price is $Price.
- EOF
-
- print << x 10; # Legal but discouraged. Use <<"".
- Merry Christmas!
-
- print <<`EOC`; # execute commands
- echo hi there
- echo lo there
- EOC
-
- print <<"foo", <<"bar"; # you can stack them
- I said foo.
- foo
- I said bar.
- bar
-
- myfunc(<<"THIS", 23, <<'THAT');
- Here's a line
- or two.
- THIS
- and here another.
- THAT
-
- Just don't forget that you have to put a semicolon on the
- end to finish the statement, as Perl doesn't know you're not
- going to try to do this:
-
- print <<ABC
- 179231
- ABC
- + 20;
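
The here-doc rules can be tried out with a sketch like the one below. It captures the text in variables instead of printing it directly, and the variable names are made up for the example. The last statement shows why the semicolon matters: the expression really does continue after the terminating line.

```perl
use strict;
use warnings;

my $Price = '$100';

# Double-quoted terminator: the body is interpolated.
my $interpolated = <<"EOF";
The price is $Price.
EOF

# Single-quoted terminator: the body is taken literally.
my $literal = <<'EOF';
The price is $Price.
EOF

# The statement is not finished until the semicolon after "+ 20".
my $sum = <<ABC
179231
ABC
    + 20;

print $interpolated, $literal, "$sum\n";
```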
-
-
- List value constructors
-
- List values are denoted by separating individual values by
- commas (and enclosing the list in parentheses where
- precedence requires it):
-
- (LIST)
-
- In a context not requiring a list value, the value of the
- list literal is the value of the final element, as with the
- C comma operator. For example,
-
- @foo = ('cc', '-E', $bar);
-
- assigns the entire list value to array foo, but
-
- $foo = ('cc', '-E', $bar);
-
- assigns the value of variable bar to variable foo. Note
- that the value of an actual array in a scalar context is the
- length of the array; the following assigns to $foo the value
- 3:
-
- @foo = ('cc', '-E', $bar);
- $foo = @foo; # $foo gets 3
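
The two cases can be put side by side in a short sketch. The value given to $bar is an assumption for the example, and warnings are left off on purpose because Perl would otherwise complain about the discarded list elements.

```perl
use strict;

my $bar = 'file.c';
my @foo = ('cc', '-E', $bar);

my $from_list  = ('cc', '-E', $bar);   # list literal: value of the final element
my $from_array = @foo;                 # actual array: its length

print "$from_list $from_array\n";
```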
-
- You may have an optional comma before the closing
- parenthesis of a list literal, so that you can say:
-
- @foo = (
- 1,
- 2,
- 3,
- );
-
- LISTs do automatic interpolation of sublists. That is, when
- a LIST is evaluated, each element of the list is evaluated
- in a list context, and the resulting list value is
- interpolated into LIST just as if each individual element
- were a member of LIST. Thus arrays lose their identity in a
- LIST--the list
-
- (@foo,@bar,&SomeSub)
-
- contains all the elements of @foo followed by all the
- elements of @bar, followed by all the elements returned by
- the subroutine named SomeSub. To make a list reference that
- does NOT interpolate, see the perlref manpage.
-
- The null list is represented by (). Interpolating it in a
- list has no effect. Thus ((),(),()) is equivalent to ().
- Similarly, interpolating an array with no elements is the
- same as if no array had been interpolated at that point.
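
Interpolation of sublists can be sketched as follows. The subroutine some_sub and the values are hypothetical stand-ins for the example.

```perl
use strict;
use warnings;

my @foo = (1, 2);
my @bar = ();                         # an array with no elements
sub some_sub { return ('a', 'b') }    # hypothetical sub returning a list

# Arrays, the sub's return values, and null lists all flatten
# into one list.
my @all = (@foo, @bar, some_sub(), (), ((), 3));

print scalar(@all), ": @all\n";
```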
-
- A list value may also be subscripted like a normal array.
- You must put the list in parentheses to avoid ambiguity.
- Examples:
-
- # Stat returns list value.
- $time = (stat($file))[8];
-
- # Find a hex digit.
- $hexdigit = ('a','b','c','d','e','f')[$digit-10];
-
- # A "reverse comma operator".
- return (pop(@foo),pop(@foo))[0];
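
A runnable variation on these examples, as a sketch with invented data. The last line subscripts a list with more than one index, which yields a list.

```perl
use strict;
use warnings;

my $digit    = 12;
my $hexdigit = ('a', 'b', 'c', 'd', 'e', 'f')[$digit - 10];

my @stack  = (1, 2, 3);
my $popped = (pop(@stack), pop(@stack))[0];   # value of the first pop
# Both pops ran, so @stack now holds only one element.

my ($smallest, $largest) = (sort { $a <=> $b } 42, 7, 19)[0, -1];

print "$hexdigit $popped [@stack] $smallest $largest\n";
```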
-
- Lists may be assigned to if and only if each element of the
- list is legal to assign to:
-
- ($a, $b, $c) = (1, 2, 3);
-
- ($map{'red'}, $map{'blue'}, $map{'green'}) = (0x00f, 0x0f0, 0xf00);
-
- The final element may be an array or a hash:
-
- ($a, $b, @rest) = split;
- local($a, $b, %rest) = @_;
-
- You can actually put an array anywhere in the list, but the
- first array in the list will soak up all the values, and
- anything after it will get a null value. This may be useful
- in a local() or my().
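
How an array or hash on the left side soaks up the remaining values can be seen in a sketch like this. All names and values are made up for the example.

```perl
use strict;
use warnings;

my ($first, @rest)         = (1, 2, 3, 4);
my ($head, @greedy, $tail) = (1, 2, 3, 4);   # @greedy takes 2, 3 and 4
my ($name, %options)       = ('widget', 'color' => 'red', 'size' => 3);

print "rest=@rest greedy=@greedy tail=",
      (defined($tail) ? $tail : 'undef'), "\n";
```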
-
- A hash literal contains pairs of values to be interpreted as
- a key and a value:
-
- # same as map assignment above
- %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);
-
- It is often more readable to use the => operator between
- key/value pairs (the => operator is actually nothing more
- than a more visually distinctive synonym for a comma):
-
- %map = (
- 'red' => 0x00f,
- 'blue' => 0x0f0,
- 'green' => 0xf00,
- );
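
That the two spellings build the same hash can be checked with a small sketch. The hash names are invented; the keys and values are the ones used above.

```perl
use strict;
use warnings;

my %with_commas = ('red', 0x00f, 'blue', 0x0f0, 'green', 0xf00);
my %with_arrows = ('red' => 0x00f, 'blue' => 0x0f0, 'green' => 0xf00);

# Render each hash as a sorted "key=value" string and compare.
my $left  = join(',', map { "$_=$with_commas{$_}" } sort keys %with_commas);
my $right = join(',', map { "$_=$with_arrows{$_}" } sort keys %with_arrows);
my $same  = ($left eq $right) ? 1 : 0;

print "$left\n";
```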
-
- Array assignment in a scalar context returns the number of
- elements produced by the expression on the right side of the
- assignment:
-
- $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2
-
- This is very handy when you want to do a list assignment in
- a Boolean context, since most list functions return a null
- list when finished, which when assigned produces a 0, which
- is interpreted as FALSE.
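
A sketch of list assignment used in scalar and Boolean context. The hash contents are example data, and the final line is a common counting idiom built on the same rule rather than anything specific to this page.

```perl
use strict;
use warnings;

# Scalar context: the count of right-hand elements, not of variables.
my $x = (my ($foo, $bar) = (3, 2, 1));

# Boolean context: each() returns a null list when the hash is
# exhausted, so the assignment produces 0 and the loop ends.
my %age  = ('al' => 30, 'bo' => 25);
my $seen = 0;
while (my ($name, $years) = each %age) {
    $seen++;
}

# Assigning to an empty list still counts the right-hand elements.
my $commas = () = 'a,b,c,d' =~ /,/g;

print "$x $seen $commas\n";
```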